Robust Boosting via Convex Optimization: Theory and Applications

Author

  • Gunnar Rätsch

Abstract

In this work we consider statistical learning problems. A learning machine aims to extract information from a set of training examples such that it can predict the associated labels of unseen examples. We consider the case where the resulting classification or regression rule is a combination of simple rules, also called base hypotheses. The so-called boosting algorithms iteratively find a weighted linear combination of base hypotheses that predicts well on unseen data. We address the following issues:

The statistical learning theory framework for analyzing boosting methods. We study learning-theoretic guarantees on the prediction performance on unseen examples. Recently, large margin classification techniques have emerged as a practical result of the theory of generalization, in particular boosting and support vector machines. A large margin implies good generalization performance. Hence, we analyze how large the margins in boosting are and derive an improved algorithm that is able to generate the maximum margin solution.

How can boosting methods be related to mathematical optimization techniques? To analyze the properties of the resulting classification or regression rule, it is of high importance to understand whether and under which conditions boosting converges. We show that boosting can be used to solve large-scale constrained optimization problems whose solutions are well characterized. To show this, we relate boosting methods to methods known from mathematical optimization and derive convergence guarantees for a quite general family of boosting algorithms.

How can boosting be made noise robust? One problem of current boosting techniques is that they are sensitive to noise in the training sample. To make boosting robust, we transfer the soft-margin idea from support vector learning to boosting. We develop theoretically motivated, regularized algorithms that exhibit high noise robustness.

How can boosting be adapted to regression problems? Boosting methods were originally designed for classification problems. To extend the boosting idea to regression problems, we use the previous convergence results and relations to semi-infinite programming to design boosting-like algorithms for regression problems. We show that these leveraging algorithms have desirable theoretical and practical properties.

Can boosting techniques be useful in practice? The presented theoretical results are accompanied by simulation results that either illustrate properties of the proposed algorithms or show that they work well in practice. We report on successful applications in a non-intrusive power monitoring system, chaotic time series analysis, and a drug discovery process.
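As a concrete illustration of the procedure the abstract describes (an iteratively built weighted linear combination of base hypotheses), the following minimal NumPy sketch implements generic AdaBoost with decision stumps. It shows the common starting point of this line of work, not one of the regularized, noise-robust variants developed in the thesis; the decision-stump base learner and all function names are illustrative assumptions.

    import numpy as np

    def fit_stump(X, y, w):
        # Exhaustively pick the decision stump (feature, threshold, sign)
        # with the smallest weighted 0/1 error under the current weights w.
        best, best_err = None, np.inf
        for j in range(X.shape[1]):
            for thr in np.unique(X[:, j]):
                for sign in (1.0, -1.0):
                    pred = sign * np.where(X[:, j] <= thr, 1.0, -1.0)
                    err = w[pred != y].sum()
                    if err < best_err:
                        best, best_err = (j, thr, sign), err
        return best, best_err

    def stump_predict(stump, X):
        j, thr, sign = stump
        return sign * np.where(X[:, j] <= thr, 1.0, -1.0)

    def adaboost(X, y, n_rounds=50):
        # y must be in {-1, +1}; returns a list of (alpha_t, stump_t) pairs.
        w = np.full(len(y), 1.0 / len(y))            # example weights
        ensemble = []
        for _ in range(n_rounds):
            stump, err = fit_stump(X, y, w)
            err = max(err, 1e-12)                    # guard against division by zero
            if err >= 0.5:                           # base learner no better than chance
                break
            alpha = 0.5 * np.log((1.0 - err) / err)  # hypothesis weight
            w *= np.exp(-alpha * y * stump_predict(stump, X))
            w /= w.sum()                             # re-normalize example weights
            ensemble.append((alpha, stump))
        return ensemble

    def predict(ensemble, X):
        f = sum(a * stump_predict(s, X) for a, s in ensemble)
        return np.sign(f)

The normalized margin of a training example, y_i * f(x_i) / sum_t alpha_t, is the quantity whose size and maximization the first part of the abstract refers to; plain AdaBoost enlarges it only indirectly, which is what motivates the maximum-margin and soft-margin variants mentioned above.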


Similar resources

Boosting in the presence of outliers: adaptive classification with non-convex loss functions

This paper examines the role and efficiency of non-convex loss functions for binary classification problems. In particular, we investigate how to design a simple and effective boosting algorithm that is robust to outliers in the data. The analysis of the role of a particular non-convex loss for prediction accuracy varies depending on the diminishing tail properties of the gradient of th...


Characterizing Robust Solution Sets of Convex Programs under Data Uncertainty

This paper deals with convex optimization problems in the face of data uncertainty within the framework of robust optimization. It provides various properties and characterizations of the set of all robust optimal solutions of the problems. In particular, it provides generalizations of the constant subdifferential property as well as the constant Lagrangian property for solution sets of convex ...


A New Perspective on Boosting in Linear Regression via Subgradient Optimization and Relatives

Boosting [6,9,12,15,16] is an extremely successful and popular supervised learning technique that combines multiple “weak” learners into a more powerful “committee.” AdaBoost [7, 12, 16], developed in the context of classification, is one of the earliest and most influential boosting algorithms. In our paper [5], we analyze boosting algorithms in linear regression [3,8,9] from the perspective o...
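The componentwise procedure this line of work analyzes can be stated in a few lines. The sketch below, under the assumption of centered inputs with nonzero column norms, implements least-squares boosting with shrinkage: each round fits the single column that best explains the current residuals and moves its coefficient a small step. The step size eps and the function name are illustrative choices, not fixed by the paper.

    import numpy as np

    def ls_boost(X, y, eps=0.1, n_rounds=200):
        # Componentwise least-squares boosting: the "weak learner" is a
        # univariate linear fit, and the "committee" is the coefficient
        # vector beta accumulated over the rounds.
        beta = np.zeros(X.shape[1])
        residual = y.astype(float).copy()
        col_norms = (X ** 2).sum(axis=0)           # assumed nonzero
        for _ in range(n_rounds):
            coefs = X.T @ residual / col_norms     # per-column least-squares fits
            # pick the column whose fit reduces the squared error the most
            losses = ((residual[:, None] - X * coefs) ** 2).sum(axis=0)
            j = np.argmin(losses)
            beta[j] += eps * coefs[j]              # shrunken coefficient update
            residual -= eps * coefs[j] * X[:, j]   # update the residuals
        return beta

With eps = 1 this reduces to the classical greedy update; a small eps gives the slowed-down variant whose connection to subgradient optimization and to regularized regression the cited analysis develops.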


Strong Duality in Robust Convex Programming: Complete Characterizations

Duality theory has played a key role in convex programming in the absence of data uncertainty. In this paper, we present a duality theory for convex programming problems in the face of data uncertainty via robust optimization. We characterize strong duality between the robust counterpart of an uncertain convex program and the optimistic counterpart of its uncertain Lagrangian dual. We...


Event-driven and Attribute-driven Robustness

Over five decades have passed since the first wave of robust optimization studies by Soyster and Falk. It is striking that real-life applications of robust optimization are still swept aside; there is much more potential for investigating the exact nature of uncertainties to obtain intelligent robust models. For this purpose, in this study, we investigate a more refined description...




Publication date: 2001